CDMO: Chaotic Dwarf Mongoose Optimization Algorithm for feature selection

In this paper, a modified version of the Dwarf Mongoose Optimization Algorithm (DMO) for feature selection is proposed. DMO is a recent swarm intelligence algorithm that mimics the foraging behavior of the dwarf mongoose. The developed method, named Chaotic DMO (CDMO), is a wrapper-based model that selects the features giving higher classification accuracy. To speed up convergence and increase the effectiveness of DMO, ten chaotic maps were used to modify the key elements of dwarf mongoose movement during the optimization process. To evaluate the efficiency of CDMO, ten different UCI datasets are used, and the method is compared against the original DMO and other well-known meta-heuristic techniques, namely Ant Colony Optimization (ACO), Whale Optimization Algorithm (WOA), Artificial Rabbit Optimization (ARO), Harris Hawk Optimization (HHO), Equilibrium Optimizer (EO), Ring Theory based Harmony Search (RTHS), Random Switching Serial Gray-Whale Optimizer (RSGW), Salp Swarm Algorithm based on Particle Swarm Optimization (SSAPSO), Binary Genetic Algorithm (BGA), Adaptive Switching Gray-Whale Optimizer (ASGW) and Particle Swarm Optimization (PSO). The experimental results show that CDMO gives higher performance than the other methods used in feature selection. High values of accuracy (91.9–100%), sensitivity (77.6–100%), precision (91.8–96.08%), specificity (91.6–100%) and F-Score (90–100%) are obtained for all ten UCI datasets. In addition, the proposed method is further assessed against the CEC’2022 benchmark functions.

Feature selection is one of the major steps in pattern recognition and classification, since it aims to eliminate the redundant and irrelevant features within a dataset. It can be challenging to decide which features are useful without prior knowledge. As a result, numerous feature selection techniques are used to select the features that give superior performance 1. In practical applications in particular, a dataset often contains a large number of features. The key objective of feature selection is to gain a better understanding of the process that produced the data in order to identify a subset of pertinent features from the vast pool of available features 2.
There are two main types of feature selection techniques. Filter techniques do not rely on learning algorithms but on specific data attributes. In contrast, wrapper approaches evaluate the chosen subset of features using a learning algorithm. Although wrapper methods are computationally expensive, they are more accurate than filter approaches 3. In general, feature selection is a multi-objective optimization problem. Its two main goals are to reduce the feature space and to give high performance. When these two objectives conflict, as they frequently do, a trade-off between them must be chosen 4.
Recently, meta-heuristic optimization algorithms have frequently been used to find the most discriminative features. The most studied methods are Particle Swarm Optimization (PSO) 5, Ant Colony Optimization (ACO) 6, Genetic Algorithm (GA) 7, Genetic Programming (GP) 8, Simulated Annealing (SA) 9, Differential Evolution (DE) 10, Cuckoo Search (CS) 11, Artificial Immune Systems Algorithm (AIS) 12, Tabu Search (TS) 13, and Whale Optimization Algorithm (WOA) 14. In addition, multi-objective and hybrid versions of these classical meta-heuristic algorithms have been published. The No Free Lunch (NFL) theorem explains the multiplicity of studies: no algorithm can give the best solution for all problems, so there is always a chance of finding a better solution with a new meta-heuristic algorithm, which is why there are hundreds of studies in this field 15.
The main contributions of this study are as follows:
• Propose a new hybrid feature selection method called CDMO, based on improving the performance of DMO using chaotic maps.
• Evaluate the proposed CDMO method on ten UCI datasets, employing K-nearest Neighbors (KNN) as the classifier, to prove its effectiveness.
• Show that the proposed CDMO gives superior performance to the original DMO algorithm and to other well-known meta-heuristic-based feature selection methods.
• On the CEC'22 test suite, compute the effectiveness and solution quality of the proposed method for all 9 chaotic maps and compare them with state-of-the-art algorithms.
The rest of this study is organized as follows. Section "Background" presents background on the DMO algorithm and chaotic maps. Section "The proposed CDMO for feature selection" explains the proposed model. Experimental results and analysis are discussed in Section "Experimental results". Finally, the conclusion is summarized in Section "Conclusion and future work".

Background

Dwarf Mongoose Optimization Algorithm (DMO)
DMO 27 is a meta-heuristic method that simulates the foraging behavior of the dwarf mongoose, which relies on compensatory behavioral adaptations. The mongoose has two main compensatory behavioral adaptations, described as follows. Large prey items, which could provide food for the whole group, are not amenable to capture by dwarf mongooses. Due to the lack of a killing bite and organized pack hunting, the dwarf mongoose has evolved a social structure that allows each individual to survive independently and move from one location to another. The dwarf mongoose lives a semi-nomadic lifestyle in an area big enough to accommodate the entire colony. Because no previously visited sleeping mound is returned to, the nomadic lifestyle ensures that the entire territory is explored and prevents over-exploitation of any one area 27.

Population initialization
The candidate population of mongooses (X) is initialized using Eq. (1). The population is generated stochastically between the lower bound (LB) and the upper bound (UB) of the given problem.

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & x_{i,j} & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix} \tag{1}$$

where X is the population, created at random by Eq. (2), $x_{i,j}$ stands for the location of the jth dimension in the ith population member, n stands for the population size, and d stands for the problem dimension.

$$x_{i,j} = VarMin + rand \times (VarMax - VarMin) \tag{2}$$

where rand is a random number in [0, 1], and VarMax and VarMin are the upper and lower bounds of the problem. The best solution over the iterations is the best solution obtained so far.
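The initialization step can be sketched in a few lines of Python. This is only an illustration, assuming Eq. (2) has the usual form x = VarMin + rand × (VarMax − VarMin); the function name `init_population` and the optional `rng` argument are hypothetical and not taken from the paper.

```python
import random

def init_population(n, d, var_min, var_max, rng=None):
    """Return n candidate solutions, each a list of d values in [var_min, var_max).

    Assumed form of Eq. (2): x = var_min + rand * (var_max - var_min).
    """
    rng = rng or random.Random()
    return [
        [var_min + rng.random() * (var_max - var_min) for _ in range(d)]
        for _ in range(n)
    ]

# Example: 5 mongooses in a 3-dimensional search space bounded by [0, 1].
population = init_population(n=5, d=3, var_min=0.0, var_max=1.0,
                             rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps runs reproducible, which is convenient when several algorithms are compared over repeated runs.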
The fitness of each solution is calculated after the population has been initialized. Equation (3) calculates the probability value for each population fitness, and the alpha female (α) is chosen based on this probability.
The number of mongooses in the alpha group is n − bs, where bs represents the number of nannies. Peep is the alpha female's vocalization that directs the family's path.
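A minimal sketch of fitness-proportional alpha selection follows. It assumes that Eq. (3) normalizes each fitness value by the sum of all fitness values, as in the original DMO description, and that fitness values are non-negative; the helper names are hypothetical.

```python
import random

def selection_probabilities(fitness):
    """Normalize non-negative fitness values so they sum to 1 (assumed Eq. (3))."""
    total = sum(fitness)
    if total == 0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(fitness)] * len(fitness)
    return [f / total for f in fitness]

def choose_alpha(fitness, rng=None):
    """Pick the index of the alpha female by roulette-wheel sampling."""
    rng = rng or random.Random()
    probs = selection_probabilities(fitness)
    return rng.choices(range(len(fitness)), weights=probs, k=1)[0]

fitness = [0.80, 0.95, 0.60, 0.90]
probs = selection_probabilities(fitness)
alpha_index = choose_alpha(fitness, rng=random.Random(1))
```

Under this assumption, individuals with higher fitness are more likely, but not guaranteed, to be chosen as the alpha.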
The DMO applies the formula in Eq. (4) to produce a candidate food position.
Here, phi is a uniformly distributed random number in [−1, 1]. After each iteration, the sleeping mound is specified as in Eq. (5).
The average value of the sleeping mounds found is given by Eq. (6).
The mongooses are known to avoid returning to the previous sleeping mound, so the scouts search for the next one to ensure exploration. The scout mongoose is simulated by Eq. (7).
where $CF = \left(1 - \frac{iter}{Max_{iter}}\right)^{\left(2\,\frac{iter}{Max_{iter}}\right)}$ is the parameter that controls the group's collective-volatile movement and decreases with each iteration, and $\vec{M}$ is the vector that controls the mongoose's movement to its new sleeping mound.
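The scouting step can be illustrated as follows. The sketch assumes the forms given in the original DMO description: CF = (1 − iter/Max_iter)^(2·iter/Max_iter), and a move of size CF × phi × rand × (X − M) whose sign depends on whether the average sleeping mound value improved. The function names, the scalar treatment of phi and rand, and the precomputed movement vector `m` are assumptions made for illustration.

```python
import random

def cf(iteration, max_iter):
    """Assumed form: CF = (1 - iter/Max_iter) ** (2 * iter / Max_iter)."""
    ratio = iteration / max_iter
    return (1.0 - ratio) ** (2.0 * ratio)

def scout_step(x, m, iteration, max_iter, mound_improved, rng=None):
    """Move one position x relative to the movement vector m (assumed Eq. (7))."""
    rng = rng or random.Random()
    step_scale = cf(iteration, max_iter)
    phi = rng.uniform(-1.0, 1.0)   # uniformly distributed in [-1, 1]
    r = rng.random()               # the random number a chaotic value could replace
    sign = -1.0 if mound_improved else 1.0
    return [xj + sign * step_scale * phi * r * (xj - mj) for xj, mj in zip(x, m)]

new_x = scout_step([0.2, 0.8], [0.5, 0.5], iteration=10, max_iter=100,
                   mound_improved=True, rng=random.Random(0))
```

With this form, CF equals 1 at the first iteration and shrinks to 0 at the last one, so scouting steps become smaller as the search proceeds.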

Chaotic maps
Chaos is a phenomenon that can exhibit non-linear changes in future behavior when its initial condition is even slightly altered. Additionally, it is described as a semi-random behavior generated by nonlinear deterministic systems 28. One of the main chaos-based search algorithms is the Chaos Optimization Algorithm (COA), which maps variables and parameters from the chaotic space to the solution space. It searches for the global optimum by exploiting the stochastic, regular, and periodic properties of chaotic motion. Due to its simplicity and fast convergence, COA has been widely used over the last ten years in many papers, e.g., [29][30][31][32]. To obtain the chaotic sets, we used ten well-known one-dimensional maps that appear frequently in the literature. Figure 1 shows that the maps have different behaviors, which allows the behavior of DMO to be tested on different maps.

The proposed CDMO for feature selection

In this study, an alternative feature selection technique is proposed using the Chaotic Dwarf Mongoose Optimization (CDMO), as in Fig. 2. The random numbers used in Eq. (7) are replaced by chaotic maps to avoid returning to the same sleeping mound.
Here, ρ is the value obtained from the well-known chaotic maps reported in Table 1. Next, the problem dimension d in Eq. (1) is set to the number of features, and VarMin and VarMax in Eq. (2) are set to 0 and 1, respectively. Since the values lie between 0 and 1, each position in $X_i$ (each row of Eq. (1)) is thresholded at 0.5 as in Eq. (9): elements with positions > 0.5 are considered candidate features, while the remaining elements are not considered in this solution.

$$X_{i,j} = \begin{cases} 1, & x_{i,j} > 0.5 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

The candidate features are then passed to the fitness function, which calculates the classification accuracy of the k-nearest neighbor classifier using those features.
Each time the fitness function is invoked, the dataset is divided using the holdout method into 80% training data and 20% testing data. Algorithm 1 and Fig. 2 show the algorithm and the flowchart of the proposed technique, respectively.
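The binarization and the wrapper fitness described above can be sketched in plain Python. This is a simplified illustration rather than the authors' implementation: the value of k, the Euclidean distance, the random shuffling before the 80/20 split, the score given to an empty feature subset, and all function names are assumptions.

```python
import math
import random
from collections import Counter

def binarize(position, threshold=0.5):
    """Eq. (9)-style mask: 1 keeps the feature, 0 drops it."""
    return [1 if value > threshold else 0 for value in position]

def knn_predict(train_x, train_y, sample, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], sample)
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def fitness(position, features, labels, k=5, test_ratio=0.2, rng=None):
    """Holdout accuracy of KNN restricted to the selected feature columns."""
    rng = rng or random.Random()
    selected = [j for j, keep in enumerate(binarize(position)) if keep]
    if not selected:
        return 0.0  # assumption: an empty subset gets the worst score
    rows = [[row[j] for j in selected] for row in features]
    order = list(range(len(rows)))
    rng.shuffle(order)
    n_test = max(1, int(round(test_ratio * len(rows))))
    test_idx, train_idx = order[:n_test], order[n_test:]
    train_x = [rows[i] for i in train_idx]
    train_y = [labels[i] for i in train_idx]
    correct = sum(
        knn_predict(train_x, train_y, rows[i], k) == labels[i] for i in test_idx
    )
    return correct / n_test

# Toy data: feature 0 determines the label, feature 1 is pure noise.
toy_rng = random.Random(0)
features = [[float(i % 2), toy_rng.random()] for i in range(40)]
labels = [i % 2 for i in range(40)]
score = fitness([0.9, 0.1], features, labels, k=3, rng=random.Random(1))
```

In the toy example the position vector keeps only the informative feature, so the classifier separates the two classes on the held-out samples. Because a new holdout split is drawn on every call, the same feature subset can receive slightly different fitness values across calls on real data.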

Algorithm 1. Steps of the developed method.

Dataset and parameters setting
Table 2 lists the 10 datasets used in this study, which come from the well-known UCI repository 33. They were chosen with different dimensions and different patterns to evaluate the performance of the proposed method at several levels of complexity.

Performance metrics
In this study, two types of metrics are used to evaluate performance: fitness metrics and classification metrics.
For the fitness metrics, four statistical measurements are used: the worst, best and mean fitness values and the standard deviation. They are defined in terms of BS, the best score gained in each iteration, and Nr, the number of runs 35.
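These four fitness statistics can be computed as in the sketch below. It assumes one best score per run, that a higher score is better by default, and that the standard deviation is the population form (division by Nr); the paper's exact convention may differ, and the function name is hypothetical.

```python
import statistics

def fitness_summary(best_scores, maximize=True):
    """Worst, best, mean and standard deviation of per-run best scores."""
    best = max(best_scores) if maximize else min(best_scores)
    worst = min(best_scores) if maximize else max(best_scores)
    return {
        "best": best,
        "worst": worst,
        "mean": statistics.fmean(best_scores),
        # Population standard deviation (divides by Nr); this is an assumption.
        "std": statistics.pstdev(best_scores),
    }

summary = fitness_summary([0.90, 0.95, 0.85])
```

If the fitness were an error rate to be minimized instead, `maximize=False` swaps the roles of the best and worst values.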
The second evaluation assesses the selected features using classification measures: accuracy, precision, sensitivity, specificity, and F-Score. Accuracy is a common evaluation measure, defined as the ratio of correctly classified samples to all samples:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision, specificity and sensitivity are suitable metrics for measuring classification performance on unbalanced datasets. Because they are not affected by differences in data distribution, these measures are useful for evaluating classification performance in unbalanced learning scenarios 36. The F-Score combines precision and sensitivity and is given by Eq. (19). Therefore, the F-Score is more suitable than the accuracy metric in unbalanced scenarios. Precision, sensitivity, specificity and F-Score are defined by the following equations, where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives:

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$

$$\text{F-Score} = \frac{2 \times \text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \tag{19}$$
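As a small illustration, the standard confusion-matrix definitions of these measures can be computed as follows. The function name and the convention of returning 0 when a denominator is zero are assumptions made for this sketch.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    def ratio(num, den):
        # Convention assumed here: return 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }

metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For multi-class datasets these measures are usually computed per class and then averaged; which averaging scheme the study uses is not stated in this section.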

Performance of DMO based on ten chaotic maps
To evaluate the performance of the proposed CDMO, 10 different datasets from the UCI repository are used. The obtained results are compared with DMO and other well-known meta-heuristic algorithms, namely PSO 5, ACO 6, ARO 37, HHO 38, EO 39, RTHS 40, RSGW 41, SSAPSO 42, BGA 43 and WOA 14. Each algorithm was run 25 times on a PC with the same specifications. To test the convergence capability, the average over the 25 runs was computed and compared for each algorithm. Table 3 lists the parameter settings of the algorithms used in this study. The experiments are divided into two parts: the first evaluates the performance of the ten chaotic maps on the DMO algorithm, as shown in Tables 4 and 5; the second compares the best chaotic map with the six meta-heuristic algorithms DMO, ACO, PSO, ARO, HHO, and WOA, as shown in Tables 6 and 7.
Table 4 shows the accuracy averaged over the runs for the ten CDMO variants, where the number after CDMO refers to the map number in Table 1. All maps have the same accuracy in two datasets, base_exactly and base_M-of-n3. Table 5 shows the comparison of the average fitness values of the ten chaotic maps. The Singer map (CDMO8) achieved the best results in 5 out of 10 datasets. CDMO4 and CDMO6 achieved the same result in base_M-of-n3. Also, CDMO1, CDMO3, CDMO5, CDMO7 and CDMO10 each have the best result in one dataset, so CDMO8 was chosen for comparison with the ACO, PSO, WOA, ARO, HHO and DMO algorithms. Figure 3 illustrates the convergence curves for the ten chaotic maps, with the number of iterations equal to 100. As can be observed from this figure, the Singer map obtains the best result in most cases, because it converges faster than the other maps.
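For readers unfamiliar with the Singer map, the sketch below generates a chaotic sequence using the form commonly cited in the chaotic optimization literature, with control parameter μ = 1.07. Table 1 is not reproduced in this section, so the coefficients, the parameter value, the initial value 0.7 and the function names are assumptions rather than the paper's confirmed settings.

```python
def singer_map(x, mu=1.07):
    """One iteration of the Singer map (commonly cited form, assumed here)."""
    return mu * (7.86 * x - 23.31 * x**2 + 28.75 * x**3 - 13.302875 * x**4)

def chaotic_sequence(x0, length, step=singer_map):
    """Iterate a one-dimensional map from x0 and collect the generated values."""
    values, x = [], x0
    for _ in range(length):
        x = step(x)
        values.append(x)
    return values

# One chaotic value per iteration, e.g. to stand in for a uniform random number.
rho = chaotic_sequence(x0=0.7, length=100)
```

The sequence is fully deterministic given the initial value, which is what distinguishes a chaotic value from a pseudo-random draw while still spreading the values over (0, 1).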

Comparison with other meta-heuristic techniques
In this section, the performance of the developed method based on the Singer map is compared with well-known and widely used techniques, namely PSO, ACO, ARO, HHO and WOA.
From Table 6, CDMO gives the best accuracy in seven datasets (base_BreastEW, SonarEW, SpectEW, Waveform, CongressEW, breastEW and Ionosphere), while DMO gives superior performance in one dataset, KrvskpEW. Moreover, DMO and CDMO give equal performance in two datasets (base_M-of-n3 and base_exactly). In terms of precision, CDMO8 has better results in seven datasets, DMO has better results in one dataset (BreastEW), and CDMO8 and DMO have the same results in two datasets. In terms of sensitivity, CDMO8 has the highest results in four datasets, while DMO and PSO have the highest results in three datasets and one dataset, respectively; CDMO8 and DMO have the same results in two datasets, base_exactly and base_M-of-n3. For specificity, CDMO8 has the highest results in seven datasets, while PSO has the best result in only one dataset (BreastEW); CDMO8 and DMO have the same results in two datasets. Finally, the F-measure results show that CDMO8 has better results in five datasets, DMO has the better result in the KrvskpEW dataset, ARO has better results in the SpectEW and Ionosphere datasets, and CDMO8 and DMO have the same results in two datasets.
Table 7 presents the fitness metrics: the standard deviation (SD), best, worst and average of the fitness function. For the average, CDMO8 achieved the best results in 9 out of 10 datasets, while ACO has the best result in the Ionosphere dataset only. For the best measure, CDMO8 has the best results in 5 out of 10 datasets, the original DMO in 2 out of 10, and ARO in the Ionosphere and base_M-of-n3 datasets; CDMO8 and DMO have the same result in the breastEW dataset. For the worst measure, CDMO8 has the best results in 5 out of 10 datasets, followed by PSO with 3 out of 10; WOA and DMO each have the best result in one dataset. Concerning standard deviation, WOA has the best results in 7 out of 10 datasets, and neither CDMO nor the original DMO obtained the best standard deviation.
Figure 4 compares the convergence curves of CDMO8 and the other meta-heuristic algorithms (PSO, ACO, DMO, ARO, HHO and WOA). As observed from the figure, CDMO8 converges faster in most of the plots.
Table 8 compares the accuracy of CDMO8 against 6 state-of-the-art methods, namely BGA, RTHS, RSGW, EO, SSAPSO and HSGW. The proposed CDMO method outperforms these methods, producing higher accuracy in 8 out of 10 datasets.

Performance evaluation on CEC'22 benchmark functions
In this section, the performance of the proposed CDMO algorithm in solving optimization problems is tested. To this end, the numerical efficiency of CDMO is evaluated on the twelve functions of CEC'22. Table 9 presents the outcomes on the CEC'2022 test suite over 30 runs for the ten proposed chaotic DMO variants. These benchmark functions are of four types: unimodal, basic, hybrid and composite functions. CDMO9 is found to achieve the best performance.
In order to verify the effectiveness of CDMO9, its results are compared in Table 10 with six recent optimization algorithms, namely Artificial Hummingbird Algorithm (AHA) 44, African Vultures Optimization Algorithm (AVOA) 45, Crow Search Algorithm (CSA) 46, Harris Hawks Optimization (HHO) 38, Northern Goshawk Optimization (NGO) 47 and Satin Bowerbird Optimizer (SBO) 48. In addition, to demonstrate the ability of CDMO9 to solve optimization problems, the obtained results are compared with two recently improved algorithms, namely an adaptive quadratic interpolation and rounding mechanism Sine Cosine Algorithm (ARSCA) 49 and boosting Archimedes Optimization Algorithm using trigonometric operators (SCAOA) 50. The experimental results show that the proposed method compares favorably with these methods.

Conclusion and future work
The Chaotic Dwarf Mongoose Optimization Algorithm (CDMO), a Dwarf Mongoose algorithm hybridized with chaos, was proposed. To enhance the performance of the technique, ten chaotic maps were employed, and CDMO was used as a wrapper feature selector. CDMO gives superior performance to well-known meta-heuristic algorithms, namely PSO, ACO, WOA, ARO, HHO, BGA, RTHS, RSGW, EO, SSAPSO, HSGW and DMO. The obtained results demonstrate the capability of CDMO to select feature sets that give high classification results. Moreover, the experimental results proved that the adjusted variable using the Singer map

Figure 4. Comparison between the best chaotic map and 6 meta-heuristic algorithms.

Table 2. Datasets used in this study.
For example, CDMO1 is the Chebyshev map. The results in Table 4 show that the Singer map (CDMO8) has the highest results in three datasets (breastEW, SpectEW and Waveform), while CDMO1 and CDMO7 have the best results in KrvskpEW and Ionosphere, respectively.

Table 4. Accuracy comparison between ten CDMO. Significant values are in bold.

Table 5. Average fitness comparison between ten CDMO. Significant values are in bold.

Table 6. Comparison between CDMO8 and 6 meta-heuristic algorithms in classification metrics. Significant values are in bold.

Table 7. Comparison between CDMO8 and 6 meta-heuristic algorithms in fitness metrics. Significant values are in bold.

Table 8. Comparison of CDMO8 with other 6 state-of-the-art methods based on achieved accuracy (highest classification accuracies are in bold).

Table 9. Comparison of simulation outcomes using DMO with 10 chaotic maps for a CEC'2022 test suite for 30 runs.

Table 10. Comparison of simulation outcomes for a CEC'2022 test suite for 30 runs (highest classification accuracies are in bold).